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In this paper, the Pareto-optimal beam structure for multi-user multiple-input multiple-output (MIMO) 
interference channels is investigated and a necessary condition for any Pareto-optimal transmit signal 
co variance matrix is presented for the A'-pair Gaussian [N, Mi,-- - ,Mk) interference channel. It is 
\ shown that any Pareto-optimal transmit signal covariance matrix at a transmitter should have its column 

space contained in the union of the eigen-spaces of the channel matrices from the transmitter to all 
receivers. Based on this necessary condition, an efficient parameterization for the beam search space is 
" proposed. The proposed parameterization is given by the product manifold of a Stiefel manifold and a 



subset of a hyperplane and enables us to construct a very efficient beam design algorithm by exploiting 
its rich geometrical structure and existing tools for optimization on Stiefel manifolds. Reduction in the 
beam search space dimension and computational complexity by the proposed parameterization and the 
proposed beam design approach is significant when the number of transmit antennas is larger than the 
sum of the numbers of receive antennas, as in upcoming cellular networks adopting massive MIMO 
technologies. Numerical results validate the proposed parameterization and the proposed cooperative 
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£2 ■ beam design method based on the parameterization for MIMO interference channels. 
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I. Introduction 

Multi-user multiple antenna interference channels have gained intensive interest from research com- 
munities in recent years because of the significance of proper interference control in current and future 
wireless networks. One of the break-through results in this area is interference alignment by Cadambe 
and Jafar [fl]], which provides an effective way to achieving maximum degrees-of-freedom (DoF) for 
MIMO interference channels. However, interference alignment is only DoF optimal, i.e., it is optimal 
at high signal-to-noise ratio (SNR), whereas in typical cellular networks most receivers experiencing 
severe interference are located at cell edges and hence operate in the low or intermediate SNR regime. 
Thus, Jorswieck et al. investigated the multiple antenna interference channel problem from a different 
perspective based on Pareto-optimality (2). The framework of Pareto-optimality is especially useful for 
interference channels since the users in an interference channel basically form a group for negotiation. 
Under this framework, Jorswieck et al. showed for multiple-input single-output (MISO) interference 
channels that any Pareto-optimal beam vector at a transmitter is a normalized convex combination of 
the zero-forcing (ZF) beam vector and the matched-filtering (MF) beam vector in the case of two users 
and a linear combination of the channel vectors from the transmitter to all receivers in the general 
case of an arbitrary number of users. Their result and subsequent results by other researchers provide 
useful parameterizations for the optimal beam search space for efficient cooperative beam design in MISO 
interference channels Q-JH. However, not many results for the Pareto-optimal beam structure for MIMO 
interference channels are available, although there exist some results in limited circumstances 191- fTTl . 

In this paper, we provide a necessary condition for Pareto-optimal beamformers for the K-pair Gaus- 
sian (N,Mi, ■ ■ ■ ,Mk) interference channel which can model general MIMO interference channels, 
and show that any Pareto-optimal transmit signal covariance matrix at a transmitter should have its 
column space contained in the union of the eigen-spaces of the channel matrices from the transmitter 
to all receivers. Based on this, we provide an efficient parameterization for the beam search space not 
missing Pareto-optimality whose dimension is independent of the number N of transmit antennas and is 
determined only by (Mi, • • • , Mk), when N > Yl^=i Mi. The proposed parameterization is given by the 
product manifold of a Stiefel manifold and a subset of a hyperplane and enables us to construct a very 
efficient cooperative beam design algorithm by exploiting its rich geometrical structure and existing tools 
for optimization on Stiefel manifolds. Reduction in the beam search space dimension and computational 

'in the K-pah Gaussian (TV, Mi,-- - , Mk) channel, we have K transmitter-receiver pairs, and every transmitter has N 
transmit antennas and receiver i has Mi (€ {1, • • ■ , N}) receive antennas. 



November 20, 2012 



DRAFT 



SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING, NOV. 17, 2012 



3 



complexity by the proposed parameterization and the proposed beam design algorithm is significant, 
wheniV » Ei=i M i 

as in upcoming cellular systems adopting massive MIMO technologies 11121 . 1131 . 
Furthermore, the proposed beam design algorithm does not need to fix the number of data streams for 
transmission beforehand and it finds an (locally) optimal DoF for a given finite SNR. This is beneficial 
because the optimal DoF is not known for a finite SNR in most cases. 

Notations and Organization In this paper, we will make use of standard notational conventions. 
Vectors and matrices are written in boldface with matrices in capitals. All vectors are column vectors. 
For a matrix A, A^, || A||, tr(A), and | A| indicate the Hermitian transpose, 2-norm, trace, and determinant 
of A, respectively. Ay or [A]jj denotes the element in the i-th row and the j-th column of A. C(A) 
denotes the column space of A and C _L (A) denotes the orthogonal complement of C(A). P/;(v) denotes 
the orthogonal projection of a vector v onto a linear subspace C IIa = A(A^A) _1 A^ represents 
the orthogonal projection onto C(A) and 11^ = I — IIa- For matrices A and B, A > B means that 
A — B is positive semi-definite. I„ stands for the identity matrix of size n (the subscript is omitted 
when unnecessary), [ai, • • • , ax,] or [a^]^ denotes the matrix composed of vectors ai, - - , ax and 
diag(ai, • • • , d n ) denotes the diagonal matrix with elements a\, ■ • ■ , a n . x ~ £/V(/x, S) means that x is 
circularly-symmetric complex Gaussian-distributed with mean vector /i and covariance matrix S. R, R + , 
and C denote the sets of real numbers, non-negative real numbers, and complex numbers, respectively. 
R n denotes the n-dimensional Euclidean space and C n denotes the vector space of all complex n-tuples. 
C nxp is the set of all n x p matrices with complex elements. For a complex number a, Re{a} denotes 
the real part of a. 

The remainder of this paper is organized as follows. The system model is described in Section UU. In 
Section [TlTJ a necessary condition and a parameterization for Pareto-optimal transmit beamformers for 
MIMO interference channels are provided. In Section [IV] a beam design algorithm under the obtained 
parameterization is presented. Numerical results are provided in Section [V] followed by conclusions in 
Section EH 

II. System Model 

In this paper, we consider a Gaussian interference channel with K transmitter-receiver pairs, where 
every transmitter has N transmit antennas and receiver i has Mj receive antennas. We assume that > 1, 
i = 1, • • ■ ,K, and N > max{Mi, • • • , Mk}- Due to interference from the unwanted transmitters, the 
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received signal vector yi at receiver i is given by 

K 

yi = HuSi + ^2 H y s j + n i> (!) 

where Hij denotes the Mj x N channel matrix from transmitter j to receiver i; Sj is the N x 1 transmit 
signal vector at transmitter j generated from Gaussian distribution CA/"(0, Qj); and rij is the additive 
Gaussian noise vector at receiver i with distribution £A/"(0, 1). Here, the transmit signal covariance matrix 
Qj (= E{sjS^}) at transmitter j is chosen among the feasible set 

Qj := {Q G C 7 ^ : Q ^ 0, tr(Q) < Pj, and 1 < rank(Q) < Mj}, (2) 

where the rank constraint is imposed to guarantee that the number of transmitted data streams is at 
least one and is less than or equal to the possible maximum Mj = min{Mj,iV} for transmitter j, 
j = 1,- -,K. Note that any value of degree-of-freedom (DoF) from one to the maximum Mj is 
feasible within the feasible set Qj. From here on, we will call the considered MIMO interference channel 
the K-pair Gaussian (iV, Mi,-- - ,Mk) MIMO interference channel. The considered i^-pair Gaussian 
(N, Mi,-- - , Mr) MIMO interference channel model is especially useful for downlink cooperative 
transmit beamforming in cellular systems. In the cellular downlink case, the transmitters, i.e., basestations 
can be equipped with many transmit antennas and the number of transmit antennas can be set to be the 
same in the phase of network design. On the other hand, each receiver, i.e., a mobile station has one or two 
receive antennas and furthermore the receivers forming a cooperative beamforming group together with 
the cooperating basestations may not have the same number of antennas. The i^-pair (N, Mi, • • • , Mr-) 
MIMO interference channel model fits this situation exactly. 

Due to the assumption of N > max{Mj,i = 1, • • • , ,K}, the Mi x N channel matrix Hy is a fat 
matrix (i.e., the number of its columns is larger than or equal to that of its rows) and its singular value 
decomposition (SVD) is given by 

H„ I-'., 0][V|I,V^, (3) 

where Ujj 6 C M * xM< is a unitary matrix; G C MiXMl is a diagonal matrix composed of the singular 
values of H^; V"]'- G C WxMi is a submatrix composed of orthonormal column vectors that span the 
eigen-space of H^; and G (^Nx(n~au) ls a su bmatrix composed of orthonormal column vectors 
that span the zero-forcing space of Hjj. Thus, HjjV^ ^ and H^V^ = 0. From here on, we shall 
refer to C(vjj) and C(V^) as the parallel and vertical spaces of (or simply Hjj with slight abuse 
of notation), respectively. For the purpose of beam design in later sections, we assume that the channel 
information is known to all the transmitters. 
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Under the assumption that interference is treated as noise at each receiver, for a given set of transmit 
signal covariance matrices {Qi,--- ,Qk} and a given set of realized channel matrices {Hij,i,j = 
1, ■ ■ ■ , K}, the rate of the i-th transmitter-receiver pair is given by 

Ri({Qu--- ,Qjc})=Iog|l+(l + X) H iiQi H ?)~ 1 HiiQiH,f| (4) 

for i = 1, • • ■ , K. Then, for the given set of realized channel matrices, the achievable rate region of the 
MIMO interference channel with interference treated as noise is defined as the union of rate-tuples that 
can be achieved by all possible combinations of transmit covariance manices: 

n ■.= |J (fli({Qi, • • • , q k ), Rk{Qu ■■■ , Qk)). (5) 

{Q;: QieQi,l<i<K} 

The outer boundary of the rate region 1Z in the first quadrant is called the Pareto boundary of 1Z and it 
consists of rate-tuples for which the rate of any one user cannot be increased without decreasing the rate 
of at least one other user. 

In the rest of this paper, we shall investigate the Pareto-optimal transmit beam structure for the K-pak 
Gaussian (N, Mi, • • • , Mjc) MIMO interference channel and develop an efficient beam design algorithm 
based on the obtained Pareto-optimal beam structure. 

III. A Necessary Condition for Pareto-Optimality for Transmit Beamforming in 

MIMO Interference Channels 

In this section, we provide a necessary condition for Pareto-optimal transmit covariance matrices for 
the A'-pair Gaussian (7Y, Mi, • • • , Mr-) MIMO interference channel, which reveal the structure of Pareto- 
optimal transmit beamformers. The necessary condition is given in the following theorem. 

Theorem 1: For the if-pair Gaussian (N,Mi,--- ,Mjf) MIMO interference channel in which the 
channel matrices {Hjj} are randomly realized and interference is treated as noise at each receiver, any 
Pareto-optimal transmit signal covariance matrix Q* at transmitter i should satisfy 

C(Q*)CCavS,,---,Vy)=C([H^,--- ,Hfj) in all cases (6) 

and 

tr(Q*) = Pi in the case that N > Y$Li Mi. (7) 



November 20, 2012 



DRAFT 



SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING, NOV. 17, 2012 



6 



Proof: First, we consider the case that N > J2iLi^i- Suppose that the matrix [V^, ■•■ , V^J 6 
£NxJ2i Mi h as j-ank m (< _/\r). ^Then, there exists an orthonormal basis {ui}^ m that spans C ± ([v| i , • • • , 



N-m\ 



^([V^,---,Vy)=C({uJ-- 



(8) 



Now, suppose that a set of covariance matrices {Qi, i = 1, • • • ,K} is Pareto-optimal (i.e., it achieves a 
Pareto boundary point of the achievable rate region TZ) and that C(Qj) 2 ^([v|., • • • , V|Lj) at transmitter 
i. Then, we can express Qi as 



N—m 



(9) 



z=l 



where X; )>= 0, tr(Qi) < Pi, and a 2 = uf Qm. Here, C(Q;) £ C([vf-, • • • , V^J) implies that a 2 7^ 
for some z* e {1, • ■ • , N — m}. Let i be such an index and let 



Q: := Q, - afujuf 



(10) 



with a? / 0. Then, tr(Q-) = tr(Qj) - a? < tr(Qj) < P and Q- is positive semi-definitel] Thus, Q. is a 
valid transmit signal covariance matrix. Now consider the rate-tuple that is achieved by {Qi, • • • , Q^, • • • , Qk}- 
Let the interference covariance matrix at receiver i be denoted by 

<!>,:= I + ^H jfc Q fc H^. (11) 

Then, with the new set of transmit signal covariance matrices, the rate of the i-th transmitter-receiver 
pair is given by 



RittQi, •••,<&•••, Q K }) = log 1 + <i>, : h„q:ii 



(a) 



log 
log 



I + ^Hi^Qi-afu^uf)^ 

1 *. 'Hi-Q.li" 



RidQi,--- Qh ,Qk}), 



(12) 



2 When m = N, the condition l|6} is trivially satisfied since the channel matrices are randomly realized and thus [Vf 



i> ' * Kil 



spans the whole space. 

3 The positive semi-definiteness of Q; can be shown as in 11141 . First, u^Q^uj = uF(Qj — a?UjuF)lij = u^QiUj — 
ct? 1 1 uj 1 1 2 — by the definition of a? and ||u;|| = 1. For any vector w orthogonal to u^, we have w^Q^w — 
w" (Qi — a?ujuF)w = w H QiW > by the positive semi-definiteness of Q^. Since any vector in is contained in 

cflvL ■ ■ ■ . v Li)ffiC({ u '}£ m )' The daim foll ° ws - 



November 20, 2012 



DRAFT 



SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING, NOV. 17, 2012 



where step (a) holds because 6 C _L ([V^, • • • , Vjj^J) and hence HjjUj = 0. Similarly, the rate of the 
j-th transmitter-receiver pair (j ^ i) is given by 



^■({Qi,--- Mi 



, Q^}) = log I + (I + H iJfc Q fc Hg + UjiQ'iU'/,) HjjQjHfj 

I + (*,- - ofH^ujuf Hg-^-Qj-Hg 



(6) 



log 
log 



I + *riH jiQj Hg 



i?,({Q 



1 ) ) Qi j ' ' ' ) 



CM), 



(13) 



where step (b) holds again because u~ { € C ± ([V^ i , ••• , vj^J) and hence Hj,uj = 0. Therefore, the 
rate-tuple does not change by replacing {Qi, • • • , Qj, • • • , Q^-} with {Qi, • • • , Q^, • • • , Qk}- 
Now, construct another transmit signal covariance matrix Q" as 

Q'/:=Q' t + 5vv^, (14) 

where v satisfies HjjV ^ while HjjV = for all j ^ i. Such v exists almost surely in C([v| i5 • • • , vjj^ J) 

(i.e., v G C(v|)p| I (J C(V^) J ) for randomly realized channel matrices, because the event C(v|) C (J C(v|j 



has measure zeroO Here, 5 > is chosen so that 6 < tr( - V v«) (Pi — tr(QDj ( trns is possible since 
tr(Q9 < tr(Qi) < P;. See CE3.) and 



tr(Qj') = tr(Q; + <5vv H ) < tr(Qj) + (Pi - tr(Qj)) = P 4 . 



(15) 



Thus, is a valid transmit signal covariance matrix. Now consider the rate-tuple that is achieved by 
{Qi> • ■ 1 i Qi' ; ■ ■ ■ > Qk}- Here, we define 



*,:=I+ £ Hj-fcQfcHg + Hj-iQ^H 



(16) 



4 The dimension of C(V^) is Mi (> 1) and the dimension of [J C(V'' i ) is at most Y^-iM Mj which is strictly less than N 
by the assumption Yl- Mi < N. The probability that a randomly realized subspace of is contained in another randomly 
realized subspace of with dimension strictly less than TV is zero. 
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Then, the rate of the j-th transmitter-receiver pair receiver (j ^ i) is given by 



Rj({Qi, ■■■ , Q'{, ■■■ , Q K }) = log |l + (I + J2 h j*Q* h 5 + HjiQj'Hj) ^QjHg 



(«0 



log 
log 



I + (vp ; + (JHjjVV^Hjf) 1 H J jQ_jH^ 



i + *7 1 H JJ Q i H^ 



(«9 



^•({Qi,-- - ,<&■■■ ,Qk}) 
RjdQi,--- ,Q l , ,Qk}), 



(17) 



where step (c) holds by the construction of v and step (d) holds by (fT3T ). On the other hand, the rate of 
the i-th transmitter-receiver pair with {Qi, • • • , Q", • • • , Qk} is given by 

RittQi,--- ,Q-',--- ,Qi^}) = log|l + *- 1 H ll QfHf| 



(e) 



log|*i + H«Qi'H£|-log|* 



log |*i + H 44 (Q^ + <5vv H )Hf I - log I* 



(/) 



> logl^ + H^Hfl-logl* 



log 



^({Qi,-- - ,Q-, ,Qk}) 



is) 



(18) 



i2i({Qi,--- ,Q*, ,Qk}), 

where step (e) holds by |I + A _1 B| = |A _1 ||A + B|, step (/) holds by LemmaQ] and step (g) holds by 
(fl2l ). This contradicts our assumption that the set (Qi, • • • , Qj, • • • , Qk) of transmit signal covariance 
matrices is Pareto-optimal. Therefore, we have 



C(Q*) c C([V 



Next, suppose that C(Qj) C C([V^,--- ,V^J) but tr(Qj) < Pi. Then, by the same argument as 
before, there almost surely exists v such that H^v ^ and H^v = for all j ^ i, when N > J2i Mi- 
Let 



Qi = Qi + 5vv H , (19) 

where 5 is chosen to be 5 = (^Pi — tr(Qj)^ so that tr(Qi) = Pi- Then, the rate of the j-th 

transmitter-receiver pair (J ^ i) does not change by the same argument as in (fTTT ) and the rate of the 
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i-th transmitter-receiver pair strictly increases by the same argument as in (fT8l ). Thus, in the case of 
N > Yli Mi, each transmitter should use full power for Pareto optimality. 

Now, consider the case oi N < Y,f=x Mi. In this case, C([v| i ,--- , v|j) = for randomly 
realized channel matrices and ((6]) is trivially true. Finally, CQV^, • • • , V^-J) = C([H^, • • • , H^J) by 
the definition of vL (See ©.) ■ 

J 1 

Lemma 1: Under the same conditions as in Theorem [1] we have 

log |* 4 + H 44 (Q' 4 + 5vv H )Hf | > log |* 4 + H^Q^Hf |. (20) 

Proof: First, consider the difference: 

(*< + + Jvv^Hf ) - (*i + HjjQ-H^) = 5H iiV v H Hf 

^ 0. 

Thus, *j + Hjj(Q^ + 5vv H )Hf ^ *j + HjjQ'-Hf . This implies that the ordered eigenvalues of 
*i + H,;,;(Q- + 5vv H )Hf majorize those of ^j + H^Q-Hf . That is, let A' fc ' be the £>th largest eigenvalue 
of $j + H^Q- + <5vv^)Hf and let X' k be the fe-th largest eigenvalue of *j + H^Q-Hf . Then, 

Afe > X' k , V k. (21) 

Next, consider the difference of the traces of the two matrices: 

tr + + <Jw H )H£) - tr (*, + H^Hf ) = 5tr (H^vv^Hf ) 

= <5||H^v|| 2 

> (22) 

by the construction of v satisfying Hj/v ^ 0. By (l2Tb . (l22l and the fact that the trace of a matrix is the 
sum of its eigenvalues, there exists at least one eigenvalue A' fc ' that is strictly larger than X' k . Therefore, 
we have 

|*i + H«(Qj + «Svv")Hg I > |*t + H^Hf | 

since the determinant of a matrix is the product of its eigenvalues and both the matrices are strictly 
positive-definite due to the added identity matrix in 3>j, i.e., A'^ > A^, > 0, V k. Finally, d20l follows by 
the monotonicity of logarithm. ■ 

Theorem Q] states that the column space of any Pareto-optimal transmit signal covariance matrix at 
transmitter i should be contained in the union of the parallel spaces of the channels from transmitter i to 
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all receivers. In the case that Mi = 1 for all i = 1, • • ■ , K, the parallel space is simply the 1 -dimensional 
linear subspace spanned by the matched filtering vector. Thus, this result in Theorem Q] can be regarded 
as a generalization of the result in the MISO interference channel by Jorswieck et al. O to general 
MIMO interference channels described by the A'-pair (N,M\, ■ ■ ■ ,Mk) interference channel model. 

A. The Symmetric 2-User Case 

In this subsection, we consider the symmetric two-user case and present another representation for 
Pareto-optimal transmit signal covariance matrices in this case. 

Corollary 1: In the two-user case in which the number of receive antennas is the same (M = M\ = 
M 2 ) and N > 2M, any Pareto-optimal transmit signal covariance matrix at transmitter 1 should 
satisfy 

C(Qi) CC([vf 1 ,n v .vf 1 ]) =C([vj 1 ,v| 1 ]) (23) 

and tr(QS) = P u where n^V^ = (V 2 J i V 2 J - 1 i? )v} 1 (= V^). 

Proof: The proof is by showing the equivalence of the two subspaces: 

C([v! 1 ,(V^ 1 V 2 L 1 ^)vf 1 ]) =C([V! 1 ,V| 1 ]). (24) 

Any vector in C([v| 1 , V^-J) of the right-hand side (RHS) of (l24l can be expressed as 

Vj ix + V| iy (25) 

for some x,y € C M , whereas any vector in C([vf 1 , (V^ 1 V^ 1 H )vf 1 ]) of the left-hand side (LHS) of 
(l24l can be expressed as 

vf 1 x' + (V 2 J 1 V 2 ^)vf 1 y' (26) 
for some x',y' € C A/ . Eq. ( f26b can be rewritten as 

vf 1 x'+(V 2 L 1 V 2 L 1 ^)vf 1 y' 
=V| 1 X' + (I-Vl 1 vlf)v| 1 y' 

=V| 1 ( X ' - y') - vfiCvSf Vj^y'. (27) 
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Furthermore, V^V^ G £,MxM is invertible almost surely]^ Thus, there exists an isomorphism between 
(x,y) and (x',y') given by 

y' = -(Vjf Vf^-V 

x' =x + y' (28) 
= x-(vSfvf 1 )- 1 y 

to satisfy 

vf lX + Vly = Vl(x' - y') - vf^vlf vjjy'. (29) 

Thus, the two subspaces are equivalent, i.e., C([vf 1 , IIv^ vf-J) = C([v| 1 , V2J). Since C(Q*) C 
C([vf 1 , V^]) by TheoremQ] the claim follows. ■ 

As in the MISO case 0, the Pareto-optimal beam space C(Q^) is contained in the union of the self- 
parallel space of C(V"| 1 ) and the vertical or zero-forcing space C(n V J- vfi) of the channel to the other 
user in the two-user symmetric MIMO case. 

B. Parameterization for the Pareto-Optimal Beam Structure in MIMO Interference Channels 

Theorem Q] provides a necessary condition for Pareto-optimal transmit signal covariance matrices for 
the i\~-pair Gaussian (N,M\, ■ ■ ■ ,Mk) MIMO interference channel with interference treated as noise. 
Based on Theorem [T] in this section, we develop a concrete parameterization for Pareto-optimal transmit 
signal covariance matrices for the isT-pair Gaussian (TV, Mi, • • • ,Mk) MIMO interference channel for 
construction of a very efficient beam design algorithm in the next section. Here, we mainly focus on 
the case of N > YliLiMi, although the parameterization result here can be applied to the case of 

Since C(Q*) C C([vf j , • • • , V^J) = C([Hf[, • • • ,Hf J), any Pareto-optimal transmit signal covari- 
ance matrix Q* at transmitter i can be expressed as 

Qi = [ H il, • • • , H^]Xj [H^, • • • , ~H.Ki\ H , (30) 

where Xj is a (^,- Mj) x Q]\ Mj) positive semi-definite matrix with rank less than or equal to Mj. Note 
that [H^, • • • , H^J is a N x (£V Mj) matrix and it has full column rank almost surely for randomly 

5 vf 1 and Vjj are the parallel spaces of Hn and H21, respectively. The event that Vj^vjjj € C MxM is non-invertible 
requires that C(vf 1 ) is contained in a strict subspace of C N with dimension less than N determined by V|i. Such an event 
has measure zero for randomly realized channel matrices. 
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realized channels 



Let the (skinny) QR factorization of [Hf , • • • , Hf J be 

[ H f , • • • , H^] = TjRj, (31) 



where Tj is a N x £\ -^i matrix with orthonormal columns and Rj is a Mi) x (£\ Mj) upper 
triangular matrix. With the QR factorization, the Pareto-optimality subspace condition © can be rewritten 
as 

Q* = T 4 X^Tf , (32) 

where X. is a (£\ Mj) x (£\ Mj) positive semi-definite matrix with rank less than or equal to Mj. Since 
X- is Hermitian, i.e., self-adjoint, by the spectral theorem, it has the spectral decomposition given by 

X^ = UiAiUf, (33) 

where Uj is a (£V Mi)xMj matrix with orthonormal columns, i.e., Uf Uj = I and Aj = diag(Aji, ■ ■ ■ , AjM* 
is a Mi x Mj diagonal matrix with nonnegative elements, i.e., A.^ > for all k. Thus, any Pareto-optimal 
transmit signal covariance matrix at transmitter i is expressed as 

Q* = TiUiAiUf Tf , (34) 

which is a spectral decomposition of Q* since (YjUj)-^ (YjUj) = I. Note here that Tj is known to 
the transmitter under the assumption of known channel information and fixed for a given set of realized 
channel matrices {Hjj}. Note also that (l34l incorporates the condition ((6]) of Theorem [TJ only. In the 
case of N > ^ Mj, we have the full transmission power condition (0 additionally. Applying this full 
power constraint to d34l , we have 

Pi = tr(Q*) 

= tr(T,U l A i Uf Tf ) = tr(AiUf Tf TiUi), (T i U i )^(T i U i ) = I 

Mi 

= tr(Aj) = ^A ife , (35) 

k=l 

where > for all k. Thus, any Pareto-optimal transmit signal covariance matrix can be parameterized 
by the two matrices Uj and Aj with constraints Uf Uj = I and tr(Aj) = Pj, respectively. Especially, 
Uj's satisfying Uf Uj = I form a special subset of C^ M, ) xMl called the Stiefel manifold V^.m^m^ 

6 The full column rank assumption is not necessary. In fact, the complexity of the beam design problem is reduced when the 
matrix does not have full column rank. This step will be explained in Algorithm Q] in Section llVl 



November 20, 2012 



DRAFT 



SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING, NOV. 17, 2012 



13 



Definition 1 (Stiefel manifold): The (compact) Stiefel manifold V n>p (or V^(C n )) is the set of all n x p 
complex matrices with orthonormal columns, i.e., 

V n , p := {U G C nxp : XJ H XJ = I p }. (36) 

Note that C nxp is a vector space over C with the normal matrix addition and the scalar multiplication as 
vector addition and scalar multiplication. The Stiefel manifold V n>p is a submanifold of the vector space 
C nxp lfl5l . Now, we present our parameterization result for Pareto-optimal beamforming in the A'-pair 
(N, Mi, • • ■ , Mk) MIMO interference channel when N > £\ Mj in the following theorem. 

Theorem 2: Any Pareto-optimal transmit signal covariance matrix at transmitter i for the K-pair 
Gaussian (7Y, Mi, • • • , Mr-) MIMO interference channel with N > J2i Mi is completely parameterized 
by the product manifold M.i\ 

M i :=V EiMi ,Mi X ' H M i , (37) 

where V^.k-u.nu i s tne Stiefel manifold of orthonormal Mj-frames in C^> A/i and %Mi is a subset in 
the first quadrant of a hyperplane in the Euclidean space defined by 

H Mi ■= {(Ai, • • • , X Mi ) : A, > and J] \ = P f }. (38) 

i 

Proof: Combining Theorem [TJ and equations (l32l ). (l33l) . (l34b and (l35l ). we have the result. ■ 

Note that Mi is an embedded manifold within the original high dimensional space Qj. The main advantage 
of the parameterization in Theorem[2]is that the dimension of the parameter space (or beam search space) 
not losing Pareto-optimality does not depend on the number N of transmit antennas when N > M, and 
the proposed parameterization significantly reduces the dimension of the beam search space as compared 
to the original search space Qj, when N » ^Mj. Thus, the proposed parameterization is useful 
for upcoming cellular downlink cooperative transmission with massive MIMO technologies |[T2ll . |fT3l in 
which large-scale transmit antenna arrays are adopted at basestations while each mobile station still has 
a limited number of receive antennas. The exact dimension of the parameter space M.i for transmitter i 
is given by 

K 

D{Mi) = 2(J2 M i) M i ~ ( M if + ( M i ~ !)• 09) 

i=l 

This is because the dimension of V n>p is given by 2np — p 2 lfl5l and the dimension of H n in M. n is 
given by n — 1. In addition to the independence of the parameter space dimension on the number of 
transmit antennas, the parameterization in Theorem [2] enables us to exploit the rich geometrical structure 
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of Stiefel manifolds and hyperplanes for optimal search for beam design. This will become clear shortly 
in the next section. 

Now, consider the case that N < J2i In this case, Theorem Q] is not so helpful, but a parame- 
terization similar to that in Theorem [2] can be obtained by directly applying spectral decomposition to 



where Uj is a N x M% matrix with orthonormal columns, i.e., U^Uj = I^f. and Aj is a M% x M% positive 
semi-definite diagonal matrix. Thus, the parameter space is given by M\ := V/v,Af, x ?^>Af«> where HSm* 
is a subset of a half space of R Mi , defined as US M% ■= {(Ai, • • • , A M J : K > and £\ A* < Pi}. 



In this section, we provide an efficient beam design algorithm under the parameterization Mi containing 
all Pareto-optimal beamformers in the previous section by exploiting the geometric structure of the 
parameter space. Here, we consider a centralized beam design approach under the assumption that 
all channel information is available. For example, in cellular systems, all channel information from 
cooperating basestations can be delivered to the basestation combiner (BSC), and the BSC can compute 
beamforming matrices for all the basestations under its control and inform the computed beamforming 
matrices to the basestations under its control. When fast communication between the BSC and the 
basestations is available, such a method can be used in practice. 

A. The Overall Algorithm Structure: A Utility Function-Based Approach 

Our approach to beam design is based on the utility function based method in JSJ, |[T6*1 . In this approach, 
we define a utility function u: 



The utility function is chosen to represent the desired system performance metric. We assume that u 
is a bounded smooth function of (R\, • • • , Rk)- In addition, due to Theorem [2 we have the following 
mapping: 




IV. The Proposed Beam Design Algorithm 



u : !->■ R : (R u ■ ■ ■ 



Rk) !->• u. 



(41) 



r : Mi x • • • x Mr ^ (#1, ■ ■ ■ 



Rk) 



(42) 
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which is determined by the rate formula (01) and Qj = YjUjAjU^Y^. Here, we only need to consider 
Mi x • • • Mk as our beam search space owing to Theorem [2] The composition of the two mappings is 
given by 

u := u o r : M\ x • • • x Mr >-> R- (43) 

Note that this mapping is the desired mapping from the beam search space containing all Pareto-optimal 
beams to the set of utility values and that u is a smooth function on the product manifold M.\ x • • • x M.r 
by the smoothness assumption on u and the smoothness of the rates as functions of {Qj}. Then, the 
utility-maximizing beam design problem is formulated as 

max u(Ui, Ai, • • • ,U K , Ax-), (44) 

where A^j is given by (137T ). Although simultaneous optimization of (Ui, Ai, • • • , Ujf, A^) to maximize 
the utility function is difficult, the optimization (l44l can efficiently be solved by an alternating optimization 
technique. That is, we fix all other {(Uj, Aj), j ^ i] except (Uj, Aj) and update the unfixed parameters 
(Ui, Aj) in order that the utility function is maximized. After this update, the next (Uj, Aj) is picked for 
update. This procedure continues until it converges. The proposed overall algorithm is described below. 

Algorithm 1: The Proposed Beam Design Algorithm - The Overall Structure 
Requirements: 

• Channel information {Hy,i, j = 1, • • • ,K} 

• Maximum available transmit power {Pi, • • • , Pk} 

• Utility function u(Ri, ■ ■ ■ ,Rr) 

• Stopping tolerance e > 
Preprocessing: 

. Obtain Yj by QR factorization of [Hff , • • • , HfJ =: Hj as in (FSTT) for all i = 1. • • • , K. 

• In the above QR factorization step, the rank of Hj is revealed. Based on the revealed ranlu mi, set 
the number of rows of Uj as mi and set the number of its columns as Mj. In this step, the proper 
Stiefel manifold for Uj is identified and it is V m „Mi- 

Iteration: 
Initialization: 
,1 = 

7 If Hi is not of full column rank, the problem size simply reduces. 



November 20, 2012 



DRAFT 



SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING, NOV. 17, 2012 



16 



(0) 



. u 



while 







and A.- = ^-Im, for all i = 1, 



K 



^({(ufUfji-uiKurur} 



rP-l) aP- 1 )" 



> e 



/ = / + l; 
for i = 1, 



K 



(/+!) .(1+1)^ 
i i^i ) 



(u. 

end for 
end while 
Postprocessing: 

• Check the rank of A 



arg max 

(U 4 ,A 4 )eMi 



KUi,Ai 



- U ^ A ^)l{(u,A j )=(nf,A'"),i^ ( 45 ) 



(l,to P ) 



to determine the number d; of data streams for transmitter i. 



Construct a beamformer matrix Tj for transmitter i as 

I\ = TjUf to "\:, 1 : ^)A/diag(Aj to,>) , 



(46) 



where d, = rank f A 



and Xjf Btop \:, 1 : dj) is the matrix composed of the first di columns of 



U 



(lsta P ) 



• At transmitter i, generate di zero-mean i.i.d. data streams and construct the dj X 1 data vector dj 
with the generated data streams. Then, construct the signal vector Sj = Tjdj and transmit through 
antennas. Then, the signal vector Sj has the desired signal covariance matrix Qj. Typically, di i.i.d. 
data streams are from di independent channel encoders. 

There are several interesting features about the proposed beam design algorithm. 

• First, it is not necessary to predetermine the number of data streams for the algorithm. Although there 
exist some asymptotic results on optimal DoF at high SNR [1], the optimal number of independent 
data streams for transmission is not known for finite SNR in most cases except the known fact that 
the maximum DoF for transmitter i is Mj. Our parameterization for the beam search space includes 
all possible DoF values less than or equal to Mj. Thus, if the algorithm works properly, the algorithm 
will find the optimal DoF for given SNR automatically. When the full DoF Mj is not optimal, the 
algorithm would return (Aji, • • • , AjM») on a corner or an edge of %Mi- 

• Any transmit signal covariance matrix Qj can be implemented by a beamforming matrix Tj as in 

{33. 
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• Due to the non-convexity of utility functions with respect to (w.r.t.) {Qi} (note the rate formula (H}), 
the convergence of the proposed algorithm to the global optimum is not guaranteed, but the proposed 
algorithm converges to a locally optimal point by the monotone convergence theorem, if the step 
(|45T ) works properly, i.e., at each iteration it finds a better point in Mi than the current point. This 
is because the utility function is upper bounded and the proposed algorithm yields a monotonically 
increasing sequence of utility function values under the assumption of proper operation of the step 
(|45T ). Furthermore, in this case the proposed algorithm is stable since it monotonically converges. 
Thus, an efficient and successful implementation of the step (l45l ) is critical to the overall beam design 
algorithm. Such an implementation is possible and available because of the geometry of our parame- 
terization Mi. The problem ( f45T ) involves optimization on a Stiefel manifold, which is well established 
H21, ifTTl . In the next subsections, we briefly introduce some basic facts about Stiefel manifolds and then 
present our algorithm implementing (145T ) based on the steepest descent method or the Newton method 
on Stiefel manifolds of Edelman et al. IfTTl . 

B. Preliminaries: Riemannian Geometry on Stiefel Manifolds 

Since geometry of hyperplanes or half spaces is simple, we here provide some basic facts about the 
Stiefel manifold V n>p that are necessary to understand the subalgorithm implementing the step (|45T ). For 
a detailed explanation of the Stiefel manifold and its geometry, please refer to |fl5l . IfTTl . For general 
Riemannian geometry, please refer to lfl"8l , |[T9l . 

Tangent spaces: The tangent space TjjV n:P at a point U S V n ,%> is given by 

TuK, P = { A : A H U + U H A = 0} (47) 
= {UA + UiB : A H = -A, B e C (n " p)xp }, (48) 

where XJXJ H + Uj_U^ = I n . That is, a tangent vector at U is a n x p matrix A s.t. XJ H A is skew- 
Hermitian. 

The canonical metric: For two tangent vectors Ai and A2 in T\jV n)P , the canonical metric is defined as 

S c (Ai, A 2 ) = Re jtr (a? (I - ^UU ff )A 2 j J (49) 

Geodesies: A geodesic on a manifold is a curve on the manifold whose velocity vector field is constant 
along the curve w.r.t. a given affme connection. A geodesic formula for the Stiefel manifold V njP w.r.t. 
the Levi-Civita connection is given by the following theorem by Edelman et al. IfTTl : 
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Theorem 3 (Edelman et al. /fTTI/): Let U be a point in V np and A be a tangent vector in TtjV^ iP . 
Then, the geodesic on the Stiefel manifold emanating from U in the direction A is given by the curve 

U(() = UM(t) + QN(i), (50) 

where 

QR=(I-UU H )A (51) 

is the skinny QR decomposition of (I — UU ff )A with Q being n x p and R being p x p, and M(i) 
and N(i) are p x p matrices given by the following matrix exponential 



M(t) 


= exp 


' A -R H ' 










) 


_ N(t) _ 




R 








(52) 



where A = A. 

Gradient: For a smooth function / on the Stiefel manifold, i.e., / : V njP — > M, the gradient of / at U 
w.r.t. the canonical metric is defined as the tangent vector grad/ G TuV n>p satisfying Re {tr(/jj A)} = 
g c (grad/, A) for all tangent vectors A at U, where /u is the nxp matrix composed of partial derivatives 
of / w.r.t. the elements of U, i.e., [fv]ij = Tjfj - - ^ ne gradient of / at U is given by 

grad/ = /u - U/#U. (53) 

Hessian: For a general Riemannian manifold (S,g), the Hessian operator of a smooth function / at a 
point q € S is defined as a linear operator: Hess/ : T q S — > T q S with Hess^(v) = V v grad/ for all 
v G T q S, where V is the Levi-Civita connection on S. Just as in the Euclidean case, a smooth function 
on S admits Taylor expansion [15]. Let f q := / o R q , where R q is a retraction^ Then, in a neighborhood 
of q, we have 

/(v) « /(g) + <7(grad/,v) + ^(Hes S/ (v), v). (54) 

Thus, the stationary point v* of the RHS of d54l) satisfies the Newton equation: 

Hess/ ( v* ) + grad/ = 0. (55) 

The Hessian operator can be computed for complex Stiefel manifolds as well as real Stiefel manifolds. 
For detail, please refer to fTTI and j20l . 

8 A retraction R q is a smooth mapping from T q S to S with R q (0) — q and dR q \o is an identity map, where dR q is the 
differential of R q . The exponential map exp^ is an example of retraction. 
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C. The Subalgorithm: Steepest Descent or Newton Method on the Product Manifold 

Notice that the cost function u in (|45T ) is a smooth mapping from Vv.M^Mi x ^-K'h to R when 
{(Uj , Aj), Vj / i} are fixed. By exploiting the product structure of the parameter space, the optimization 
problem d45l ) can be solved by an alternating technique again. That is, first we fix Aj and update Uj by 
the steepest descent or Newton method on the Stiefel manifold Vj^.M^Mi CGL H3- Next, we fix Uj 
and update Aj by the steepest descent or Newton method on 'Hm v We continue this iteration until we 
have satisfactory convergence. The subalgorithm implementing the step d45l ) is given below. 

Algorithm 2: The Subalgorithm for d45l ) 
Requirements: 

• Cost function u(Uj, Aj). Set / = —u. 

• Step sizes t\ and T2 

• Stopping tolerance e' 
Initialization: 

. k = 

while |w(Ui >(A .),A ii(fe) ) — u (}3i,(k-\)i A-Hk-i)) \ > e ' 

k = k + l; 
U step 

Fix A,/. Given the current Uj G V^.m^m^ 

1. Compute the movement direction vector D G Tjj, (fc) . m s ,m s - 

* For the steepest descent method, D := — grad/ in (I53T ). 

* For the Newton method, compute D as in fT71 , |20ll . 

2. Move from Uj ^ to exp U; . (riD), where exp Ut is the exponential map at Uj That 
is, move from Uuk) i n direction D to U(ri) in ( [50T > along the geodesic given by Theorem [3] 
Then, Uj i(fe+1) = U(n). 

A step 

Fix Uj. Given the current Aj G 7/m<> 
1. Compute the movement direction vector rj. 

* For the steepest descent method, compute the gradient vector g of /(Aj) at A-iJk), and 
V ■= -g- 
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* For the Newton method, compute the Hessian matrix H of /(Aj) at A^), and rj := — H *g 

2. Obtain the projection P^ A HM i (v) °f V to tne tangent space T\ t w ?iMc 

3. Move from A.uk) to tne direction r\ on Hm,- That is, 

A i)(jfc+1) = P« M . [A i)(fe) + r 2 PT A . ((!) w Mj (»/)]■ (56) 

end while 

• (Uf +1 , Af +1) ) = (U i](fcstop) , A ii(fcstop) ) 

The step 3 in the J7 step is to maximize the utility with the constraint that the points still stay in 
the Stiefel manifold Vj^. Mi,Mr Note that for the A step, u(Aj) is a conventional multi- variable scalar 
function, i.e., it is u(\n, • • • , A^mJ- Thus, the ordinary gradient vector and the ordinary Hessian matrix for 
a function defined on a Euclidean space are valid. Furthermore, the A step is simple since a hyperplane 
is flat and thus its geometry is induced by projection from its embedding Euclidean space. In (l56l ). 
Aj ( fc ) + ~Pt a . k H M . (H _1 g) is still on the hyperplane containing but it may be outside (i-C, 
not in the first quadrant). Projection back to Hm, can be done by simple scaling of P^ A (H _1 g) 
after checking the coordinate values of A, r k \ + P^ A Hm- (H _1 g). That is, if there exists a negative 
value at some coordinate, P^ A {k) H M . (H _1 g) is scaled down and then added to Aj ^ so that the value 
at that coordinate becomes zero. 

An attracting aspect of the steepest descent method and the Newton method on the Stiefel manifolds 
is that their local convergence is established lTT5l . Thus, Algorithm [2] has the local convergence property 
and therefore, the overall algorithm, Algorithm [TJ has local convergence. Furthermore, the complexity 
of the subalgorithm is not prohibitive. Formulas for /u and /uu can be precomputed and stored for 
typical utility functions. The matrix exponential in d52l involves a matrix with small size (2Mj) x (2M;). 
There exist even simpler alternative ways to generating a curve with a given tangent vector other than 
the geodesic lfl31 . (2TJ. The subalgorithm presented here is only one example among many possible 
implementations for optimization on Stiefel manifolds and a variety of different methods are available to 
compromise complexity and performance lfT51 , |[20l . 

D. A Design Example: Weighted Sum Rate Maximization 

In this subsection, we provide a specific example for the proposed beam design method. Here, we 
consider the cooperative beam design for weighted sum rate maximization by using the steepest descent 
on the product manifold M.\ x • • • x A4k- The weighted sum rate maximization problem is formulated 
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as 



K 

max y WiRi 

{V k ,Ak,k=l,-,K} ^ 



(57) 



i=i 



K 



mm - y; Wi log i + ( i + y; ) h^q,h 



-1 



where Q; = TjUjAjU^T^ and {wi : w% > 0, £^ tOj = 1} is the set of weighting factors. To compute 
the gradient of the objective function, we manipulate the rate formula of receiver i as follows. First, 
consider the case of k = i. 



Ri 



log 
log 
log 



I + (I + ^ II AllfJ) 'H /Q H 

I + ^2 HijQjH^ + HjjQjH^ 



log 



I + ^HfjQjH^ 



constant 



=:A fcfc Af 



log 
log 

log 
log 



ri? a H 

Y fcfc 



1 + A fcfe lH fcfeQfc H fcfe A 

I + A^H fcJfc T fc U fc A fc Uf TfHf fc A- 

:C 



constant 



constant 



I + U^C/cfcCj^UfcAfc 



11 



— Cfe 
— constant 

+ log I Ad 



constant. 



Thus, the (Wirtinger) derivative of Ri w.r.t. U£ for k = i is given by [22 



(58) 
(59) 

(60) 



For the gradient of Ri w.r.t. A*., we only need to consider the diagonal elements of A& = (A^i, • 
since the off-diagonal elements are fixed to zero. In (T58T ), define Ckk '■= C^Ufc. In the case of i = k, 
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from (l58l ). the gradient of Ri w.r.t. \ k i for I G {1, • • ■ , is obtained as 



A7 



dXki 
tr 



log 



constant 



I + CfcfcAfcC^,) ( -— (I + C fefc A fc Cf fc ) 



tr 



tr 



\d\ki 



I + Ajt/Cjt fe (:, l)C kk (:, I) 



•,H 



-1 



CfcfeOjOl 1 + CfcfeAfcC^.) Cfcfc(:,Z), 



Cfefe(: ) 0C fefc (:,0 
i 



(61) 



where C kk (:,l) is the Z-th column of Cfcfc and the second equality is from |[22l . Therefore, 

= [cf fe (:, 1) (i + C fcfc A fc C&) _1 C fcfc (:, 1), . . . , C&(:, M k ) (i + C fcfe A fe Cf fe ) "'c^:, M; 



(62) 



Next, consider the case of i ^ k. In this case, 
Ri = log 



log 
log 

log 



I + (I + Hij QjH^y 1 Hu QiHf 
I + ^ H /Q II/;' + H^QiHf - log I + ^ Hij-QjH^ 



I + ^HyQ.-Hf +H ife Q fc H| - log 1 1 + HfjQj-Hg+HifcQfcH^ 



:A, k Af 



I + A^U lk Q k Uf k A- H 



jk ik 



log 



1 + B ifc lH ifcQfc H ifc B 



H-a-H 

ik 



+ constant 



log 

log 
log 



I + A^HifcT* U fc A fe Uf Tf HgAr* 



log 



I + B-^T* U fc A fc U^ Tf HlB7« 



I + U^CjfcC^UfcAfc 
A^+UfCifcC^Ufc 



log 
■ log 



I + Uf D ifc D^U fc A fe 
A^ + UfDrtDlUfe 



=:Dg =D !fc 

+ constant 

+ log |AJ — log |Afe| + constant. 



+ constant 

(63) 
(64) 



From (l63l and (l64l . the derivatives of Ri w.r.t. U£ and Aj. are respectively given for i ^ k by 

ORj 
dV* k 

and 

dRi 



Uf C ijfc C|U* (I + Uf C ifc C^U fe A fe ) 1 - Uf DifcD^Ufc (I + U^D jfe D^U fc A fc ) 1 (65) 



C&(:,l)(l + C ifc A fc c£) 1 C 4fc (:,l)-D^(:,l)(l + D a ,A fc Di N ) 'f^:, 1), • • • , (66) 



C&(:, M fc ) ( I + C ifc A fc Cf fe ) C lk (:, M k ) - D&(:, M fc ) ( I + D ifc A fc Df fe ) D ifc (:, Af fc ), 
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where Cik = C^Ufc and Djfc = D^Ufc. Then, the derivatives of the overall cost function w.r.t. U£ and 
Afc are given respectively by 

q k K dR 

OTjj(E^)=E M *OTjj (67) 

and 

i=l i=l 

for = 1, • • • , K . With the obtained derivatives, Algorithm Q] with the subalgorithm, Algorithm |2l can 
be run. 

V. Numerical Results 

In this section, we provide some numerical results to validate our beam design paradigm based on the 
parameterization {M.%} for the beam search space for MIMO interference channels. We here consider 
the weighted sum rate maximization problem proposed in Section IIV-DI 



12 




2 4 6 8 10 

Number of iteration 



Fig. 1. Convergence of Algorithm Q] K = 3, (N, M 1: M 2 , M 3 ) = (8, 2, 2, 2), and Pi = P 2 = P 3 = 30. 



First, we verified the convergence of the overall algorithm. Fig. Q] shows the convergence behavior of 
Algorithm Q] for several different channel realizations when K = 3, (N, Mi, M%, M3) = (8,2,2,2) and 
Pi = P2 = P3 = 30. Here, we used the steepest descent method on the product manifold M., L with step 
sizes 



0.05 







K 

E 



WjR, 



and 0.05 







dA k \ ^ 



K 



WjR, 



i=l 



(69) 
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for the U and A steps in our subalgorithm. The step sizes in (|69l ) are designed to gradually reduce to zero 
as the subalgorithm approaches a (locally) optimal point and to show better convergence behavior near 
the locally optimal point. It is observed in the figure that the overall algorithm converges very fast and 
the number of iterations for convergence is only a few for most channel realizations in this case. Thus, 
the main computational time lies in the execution of the subalgorithm. Although the steepest descent 
based subalgorithm is used in this demonstration, different methods with faster convergence can be used 

ma, m, ed. 

With convergence of the algorithm confirmed, we examined the sum rate performance of the proposed 
beam design algorithm. Figures [2] (a) and (b) show the rate-tuples of several beam design methods 
for two different channel settings. We considered the single-user eigen-beamforming, the zero-forcing 
beamforming in addition to the proposed beam design method. For the proposed beam design method 
for weighted sum rate maximization, we varied the weights so that we can obtain rate-tuples at different 
locations. As expected, it is seen that the rate performance of the proposed method is superior to those 
of the eigen-beamforming and the zero-forcing. Of course, the weighted sum rate maximization can be 
performed in the original beam search space Q\ x • • • x Qx by using one of gradient descent type 
algorithms. However, such a method is far less efficient than the proposed beam design method based 
on the proposed parameterization for the beam search space not losing Pareto-optimality. 
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Fig. 2. Rate pairs of several beam design methods: (a) K 

(A/, Ni,N a ) = (5, 2, 2), Px = P 2 = 10 



2, (N,Mi,M 2 ) = (6,2,2), Pi = P 2 = 5 and (b) K = 2, 
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Finally, Fig. [3] shows the sum rate performance of the algorithm w.r.t. SNR for three different system 
parameter settings: (1) K = 2, (7Y, Mi, M 2 ) = (5, 2, 2), (2) K = 3, (N, Mi, M 2 , M 3 ) = (8, 2, 2, 1), and 
(3) K = 3, (iV, Mi, M 2 , M 3 ) = (8,2,2,2). Table U summarizes the corresponding obtained rank of the 
converged Aj, i = 1, • • ■ , A", by the proposed beam design algorithm. Note that in the low SNR regime 
indeed the proposed beam design algorithm does not yield a beamformer with the maximum available 
DoFs of Mi for all i. Futhermore, it tells who should not use the available (single-user) full DoFs for sum 
rate maximization. It is expected that at low SNR the optimal strategy does not use maximum DoFs since 
all power can be allocated in the best direction, as in the single-user MIMO case. Due to the separate 
parameterization for Uj and Aj in the beam search space Mi, our algorithm can clearly identify the 
optimal rank of the beamforming matrix by checking the rank of Aj. 
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Fig. 3. Performance of the proposed method: Sum rate with respect to SNR 



VI. Conclusion 

We have considered the Pareto-optimal beam structure for multi-user multiple-input multiple-output 
(MIMO) interference channels and have provided a necessary condition for any Pareto-optimal transmit 
signal co variance matrix for the A'-pair Gaussian (7Y, Mi, • • • , Mk) interference channel. We have shown 
that any Pareto-optimal transmit signal covariance matrix at a transmitter should have its column space 
contained in the union of the eigen-spaces of the channel matrices from the transmitter to all receivers. 
Based on this necessary condition, we have proposed an efficient parameterization for the beam search 
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TABLE I 

The obtained number of data streams for Fig. [3] <f, in (di, d 2 , tfe) in the table denotes the obtained rank of 

Ai FOR TRANSMITTER-RECEIVER PAIR i 



space, given by the product manifold of a Stiefel manifold and a subset of a hyperplane. Based on the 
proposed parameterization, we have developed a very efficient beam design algorithm by exploiting the 
geometrical structure of the beam search space and existing tools for optimization on Stiefel manifolds. 
We hope that the results in this paper would be helpful for efficient intercell interference control based 
on MIMO antenna technologies in current and future cellular networks. 
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